…-validator into unit_test_fix
unit testing fix
3.5.0 released
Merge 3.5.1 into 3.6.0
Introduces a new PV puller v2 workflow with src/pv_puller_v2.py, updates configuration and constants to support the new service type, and adds MongoDao logic for upserting property PVs. Integrates the new puller into the validator controller, updates API client for batch data element retrieval, and extends MongoDB scripts and config handling for the new STS API endpoint.
Replaced references to 'CDE' with 'property' throughout pv_puller_v2.py to improve clarity and align with updated data model. Updated function names, variable names, log messages, and removed unused CDE-specific code.
…odel and property instead of cde code
Introduces insert_concept_codes_v2 in MongoDao to prevent duplicate concept code insertion and updates PVPullerV2 to use this new method. Also modifies compose_concept_code_record to use (model, property, value, concept_code) as the unique key for concept codes.
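The duplicate-prevention idea in this commit can be sketched roughly as below. This is not the actual insert_concept_codes_v2 code; the record field names (`model`, `property`, `value`, `concept_code`) and both helper functions are assumptions made for illustration, and only the shape of the unique key comes from the commit message.

```python
from typing import Iterable


def concept_code_key(record: dict) -> tuple:
    """Build the uniqueness key named in the commit message.

    The field names are hypothetical; only the key's shape
    (model, property, value, concept_code) comes from the source.
    """
    return (
        record["model"],
        record["property"],
        record["value"],
        record["concept_code"],
    )


def dedupe_concept_codes(records: Iterable[dict], existing_keys: set) -> list:
    """Return records whose key is not already known, preserving input order."""
    seen = set(existing_keys)  # copy so the caller's set is not mutated
    fresh = []
    for record in records:
        key = concept_code_key(record)
        if key not in seen:
            seen.add(key)
            fresh.append(record)
    return fresh
```

A caller would presumably load the keys already stored for a model, pass them as `existing_keys`, and insert only the returned records.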
Renamed variables and log messages in mongo_dao.py for clarity, changing references from 'cde' to 'property'. Improved error handling in pv_puller_v2.py by raising an exception when no model is configured for property PV pulling.
Add sts v2 api support and property PV handling
Introduce support for STS v2 property-level permissible values and update codepaths to retrieve and persist them. Add new STS API v2 endpoint config (sts_api_one_url_v2) and update mongo DB script with the v2 URL template. Extend DataModel with get_model_version and get_data_commons. Update MongoDao to upsert/get property PVs by property/model/version and to lookup concept codes by property+model. Replace legacy PV puller usage with a new pv_puller_v2 that fetches property PVs (get_pv_by_property_version, get_all_pvs_by_version), processes results, and saves PVs, synonyms, and concept codes to DB. Update metadata validator to use the new PV retrieval flow (model/version-based lookup and STS v2 fallback) and adjust controller to call the v2 puller. Miscellaneous logging/message tweaks and small refactors to parameter names.
Add in-memory caches and refactor permissive-value checks to reduce DB calls and simplify logic.
- src/common/mongo_dao.py: add props and concept_codes dicts and cache results in get_property_permissible_values and get_concept_code_by_pv to avoid repeated Mongo queries.
- src/common/utils.py: add CDE_PERMISSIVE_VALUES import and new has_permissive_value(prop) helper to centralize permissive-value detection.
- src/metadata_validator.py: import and use has_permissive_value to replace repetitive permissive-value checks and cleanup related code paths.
- src/config.py: simplify STS resource assignments by using sts_resource.get(...) directly.
These changes aim to improve performance (fewer DB reads), reduce duplicated code, and make permissive-value handling clearer.
…, and use a single layer dict with concat keys
Migrate permissible-value handling from CDE-centric to property-centric (STS v2). Added PROPERTY_PERMISSIBLE_VALUES and PROPERTY_TERM constants; updated utils and metadata_validator to use property terms and a not_found_property flag. Refactored mongo DAO to store/fetch property PVs and concept codes using composite cache keys (model_version_property) and to use the new permissible-values field. Removed legacy pv_puller.py and updated pv_puller_v2.py to use property-focused naming, API endpoints and extraction logic. Adjusted imports (config/validator) accordingly.
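A minimal sketch of the single-layer cache with concatenated keys described here, assuming a `fetch` callable that stands in for the Mongo query. The class name, the `_` separator, and the fetch signature are all hypothetical; the commit only states that the key combines model, version, and property.

```python
from typing import Callable, Optional


class CachedPropertyLookup:
    """Memoize permissible-value lookups in one flat dict.

    `fetch` is a hypothetical stand-in for the DAO query; it takes
    (model, version, prop) and returns a list of values or None.
    """

    def __init__(self, fetch: Callable[[str, str, str], Optional[list]]):
        self._fetch = fetch
        self._cache: dict = {}

    @staticmethod
    def cache_key(model: str, version: str, prop: str) -> str:
        # Concatenated key in model_version_property order (separator assumed).
        return f"{model}_{version}_{prop}"

    def get(self, model: str, version: str, prop: str) -> Optional[list]:
        key = self.cache_key(model, version, prop)
        if key not in self._cache:
            # Only the first lookup for a key reaches the backing store;
            # a None result is cached too, so misses are not re-queried.
            self._cache[key] = self._fetch(model, version, prop)
        return self._cache[key]
```

One design caveat with concatenated string keys: if a model, version, or property name can itself contain the separator, two different triples could collide, in which case a tuple key would be the safer choice.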
Add STS v2 property PV support and retrieval to the metadata validator
Introduce a new pytest suite for pv_puller_v2 covering PVPullerV2 behavior and helper functions. Tests use fixtures and extensive mocking (unittest.mock) to validate initialization, pulling flow (pull_property_pv_synonym_concept_codes), API interactions (retrieveAllPropertyViaAPI, get_pv_by_property_version, get_all_pvs_by_version), processing logic (process_sts_property_pv, extract_pv_list), record composition (compose_property_record, compose_synonym_record, compose_concept_code_record), and the top-level pull_pv_lists_v2 orchestration. The suite includes success cases, edge/error handling (empty results, upsert failures, API exceptions, KeyboardInterrupt), and an integration-style test for end-to-end flow with mocked API and Mongo DAO.
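The mocking style described for this suite might look roughly like the sketch below. `pull_and_save` is a hypothetical stand-in for the real pull flow, and `upsert_property_pvs` plus the argument list of `get_all_pvs_by_version` are assumptions; the actual tests and signatures in the repository may differ.

```python
from unittest.mock import MagicMock


def pull_and_save(api_client, dao, model, version):
    """Hypothetical pull flow: fetch PVs from the API, then upsert them."""
    try:
        pvs = api_client.get_all_pvs_by_version(model, version)
    except Exception:
        return False  # API failures are reported as an unsuccessful pull
    if not pvs:
        return False  # empty results count as a failure in this sketch
    return bool(dao.upsert_property_pvs(pvs))


def test_pull_success():
    api = MagicMock()
    api.get_all_pvs_by_version.return_value = [
        {"property": "p", "permissible_values": ["a"]}
    ]
    dao = MagicMock()
    dao.upsert_property_pvs.return_value = True
    assert pull_and_save(api, dao, "M", "1.0") is True
    dao.upsert_property_pvs.assert_called_once()


def test_pull_api_exception():
    api = MagicMock()
    api.get_all_pvs_by_version.side_effect = RuntimeError("boom")
    dao = MagicMock()
    assert pull_and_save(api, dao, "M", "1.0") is False
    # The DAO must not be touched when the API call fails.
    dao.upsert_property_pvs.assert_not_called()
```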
- batched metadata validation
- added batched validation progress tracking in the validation document
- added validation status details in records
- added failed validation status to indicate bad inputs for validation
- Add `Validate Metadata Batch` SQS message type for processing records in backend-defined chunks instead of fetching all records internally
- Add `_process_metadata_batch` handler with per-batch error tracking and atomic finalization via `increment_completed_batches` (`$inc`, `$max`, `$push`)
- Add `get_dataRecords_by_ids` to fetch records by explicit `_id` list from batch messages
- Add guard filter (`metadataEnded: null`) and `$unset` of tracking fields to prevent double-finalization on message redelivery
- Extend `update_validation_status` with `status_detail`, `unset_fields`, and `guard_filter` parameters
- Extend `set_submission_validation_status` with `status_detail` for surfacing per-batch failure summaries
- Extract `_initialize_for_validation` from `MetaDataValidator.validate` to share setup between batch and non-batch paths
- Extract `_process_metadata_validation` and `_process_cross_submission` helpers from the main SQS loop
- Replace `FAILED` constant with `STATUS_ERROR` across metadata and cross-submission validators
- Fix `DATA_COLlECTION` typo to `DATA_COLLECTION`
- Add circular parent reference guard in `get_file_consent_code`
- Remove duplicate error/warning block in `validate_nodes`
- Add comprehensive batch validation test suite (`test_metadata_batch_validation.py`)
- Add `BATCHED_METADATA_VALIDATION_INTERFACE.md` documenting the SQS contract, DB schema, and error handling
Co-authored-by: Cursor <cursoragent@cursor.com>
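As a rough illustration of the atomic finalization and guard filter listed above, the sketch below only builds the filter and update documents that a single MongoDB update call would take. The field names `completedBatches`, `lastBatchIndex`, and `batchErrors` are assumptions, as is the choice of which field each operator targets; only the operators (`$inc`, `$max`, `$push`) and the `metadataEnded: null` guard come from the commit message.

```python
from typing import Optional


def build_finalize_filter(validation_id: str) -> dict:
    """Match the validation document only while it is not yet finalized.

    In MongoDB, {"metadataEnded": None} matches documents where the field
    is null or missing, so a redelivered message finds nothing to update
    once finalization has set the field.
    """
    return {"_id": validation_id, "metadataEnded": None}


def build_batch_update(batch_index: int, batch_error: Optional[str] = None) -> dict:
    """Build one update document so the counters change in a single atomic write."""
    update = {
        "$inc": {"completedBatches": 1},           # hypothetical counter field
        "$max": {"lastBatchIndex": batch_index},   # hypothetical high-water mark
    }
    if batch_error is not None:
        # Record the failure without overwriting errors from other batches.
        update["$push"] = {
            "batchErrors": {"batch": batch_index, "error": batch_error}
        }
    return update
```

With a driver such as pymongo, these two documents would typically be passed together to `find_one_and_update` or `update_one`, so the increment and the guard are evaluated in the same server-side operation.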
Resolve conflicts by accepting the simplified remote revision:
- Remove guard_filter from update_validation_status
- Default totalBatches to 1; simplify < 1 check
- Remove batch summary detail for successful-but-errored batches
- Always update submission status on last batch regardless of update_ok
- Simplify _process_metadata_validation (remove None submission guard)
- Remove tests for removed features (guard_filter, batch summary, missing totalBatches)
Co-authored-by: Cursor <cursoragent@cursor.com>
- Restored failed status for validation records
- include submission, validation, and batch ids in all applicable logs
- add unit testing coverage to ensure the failed status is not written to the submission record
- ensure that when the validation scope is New, the existing submission validation status is considered
3482 batched validation
Add comprehensive tests for pv_puller_v2
- parse the deleteOrphanedDataFiles flag from the delete metadata SQS messages
- when data files are not deleted along with metadata, update the submission to add orphaned file errors
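Parsing the flag could look something like the sketch below. It assumes the SQS message body is a JSON object with `deleteOrphanedDataFiles` as a top-level key; the helper name, the default value, and the tolerance for string values are all assumptions rather than the repository's actual behavior.

```python
import json


def parse_delete_orphaned_flag(message_body: str, default: bool = True) -> bool:
    """Read deleteOrphanedDataFiles from a JSON message body.

    `default` is a hypothetical fallback for messages that omit the flag.
    """
    payload = json.loads(message_body)
    value = payload.get("deleteOrphanedDataFiles", default)
    if isinstance(value, str):
        # Tolerate producers that send the flag as a "true"/"false" string.
        return value.strip().lower() == "true"
    return bool(value)
```

When the function returns False, the caller would presumably take the second path described above and add orphaned file errors to the submission instead of deleting the files.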
3576 update delete data records
… given by MDB api
CRDCDH-3642
Optimize metadata validation
Fixed an issue where deleting one parent deleted all parents of the same type
Removed outdated unit tests.
Update Python base image to 3.14.4-alpine3.23
Fixed all fixable CVEs
Comment on lines +14 to +34
name: Run tests and upload coverage
runs-on: ubuntu-latest
steps:
  - name: Checkout repository
    uses: actions/checkout@v4
    with:
      submodules: true
  - name: Set up Python
    uses: actions/setup-python@v5
    with:
      python-version: '3.12'
  - name: Install dependencies
    run: |
      python -m pip install --upgrade pip
      pip install -r requirements.txt
      pip install pytest-cov coveralls
  - name: Run tests with coverage
    run: |
      pytest --cov=src --cov-report=xml --cov-report=term-missing --ignore=src/bento
  - name: Coveralls GitHub Action
    uses: coverallsapp/github-action@v2
No description provided.